Dataset statistics
| Number of variables | 20 |
|---|---|
| Number of observations | 176573 |
| Missing cells | 176543 |
| Missing cells (%) | 5.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 26.9 MiB |
| Average record size in memory | 160.0 B |
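The percentage and per-record figures in this table follow from the raw counts. A minimal Python sketch, assuming only the rounded values reported above (the variable names are illustrative):

```python
# Values copied from the "Dataset statistics" table.
n_variables = 20
n_observations = 176573
missing_cells = 176543
total_size_mib = 26.9  # rounded in the report

# Share of missing cells among all cells (rows x columns).
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Average record size: total memory divided by the number of rows.
avg_record_bytes = total_size_mib * 1024 ** 2 / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Average record size: about {avg_record_bytes:.0f} B")
```

Because the total size is rounded to one decimal place, the recomputed record size only approximates the reported 160.0 B.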
Variable types
| CAT | 10 |
|---|---|
| NUM | 10 |
Warnings
| Warning | Type |
|---|---|
| batsman has a high cardinality: 514 distinct values | High cardinality |
| bowler has a high cardinality: 404 distinct values | High cardinality |
| non_striker has a high cardinality: 509 distinct values | High cardinality |
| player_out has a high cardinality: 487 distinct values | High cardinality |
| fielder_caught_out has a high cardinality: 509 distinct values | High cardinality |
| season is highly correlated with id | High correlation |
| id is highly correlated with season | High correlation |
| total_runs is highly correlated with batsman_runs | High correlation |
| batsman_runs is highly correlated with total_runs | High correlation |
| type_out is highly correlated with replacements | High correlation |
| replacements is highly correlated with type_out | High correlation |
| replacements has 176543 (> 99.9%) missing values | Missing |
| extras_noballs is highly skewed (γ1 = 24.59266034) | Skewed |
| extras_byes is highly skewed (γ1 = 29.80374639) | Skewed |
| replacements is uniformly distributed | Uniform |
| extras_wides has 171230 (97.0%) zeros | Zeros |
| extras_legbyes has 173664 (98.4%) zeros | Zeros |
| extras_noballs has 175870 (99.6%) zeros | Zeros |
| extras_byes has 176097 (99.7%) zeros | Zeros |
| total_extras_runs has 167142 (94.7%) zeros | Zeros |
| batsman_runs has 71130 (40.3%) zeros | Zeros |
| total_runs has 62100 (35.2%) zeros | Zeros |
Reproduction
| Analysis started | 2020-09-24 16:29:20.382580 |
|---|---|
| Analysis finished | 2020-09-24 16:30:24.462608 |
| Duration | 1 minute and 4.08 seconds |
| Software version | pandas-profiling v2.9.0 |
| Download configuration | config.yaml |
id
Real number (ℝ≥0)
| Distinct | 746 |
|---|---|
| Distinct (%) | 0.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 713160.0962 |
|---|---|
| Minimum | 335982 |
| Maximum | 1178425 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 1.3 MiB |
Quantile statistics
| Minimum | 335982 |
|---|---|
| 5-th percentile | 336019 |
| Q1 | 501208 |
| median | 598047 |
| Q3 | 980985 |
| 95-th percentile | 1175368 |
| Maximum | 1178425 |
| Range | 842443 |
| Interquartile range (IQR) | 479777 |
Descriptive statistics
| Standard deviation | 284366.5362 |
|---|---|
| Coefficient of variation (CV) | 0.3987415135 |
| Kurtosis | -1.33600124 |
| Mean | 713160.0962 |
| Median Absolute Deviation (MAD) | 205835 |
| Skewness | 0.3585860896 |
| Sum | 1.259248177e+11 |
| Variance | 8.086432689e+10 |
| Monotonicity | Not monotonic |
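Several entries in the tables above are derived from one another, so they can be cross-checked. A minimal Python sketch, assuming only the rounded figures reported for this variable (the names are illustrative, and small discrepancies from rounding are expected):

```python
import math

# Values copied from the summary, quantile and descriptive tables above.
n = 176573
mean = 713160.0962
std = 284366.5362
q1, q3 = 501208, 980985
minimum, maximum = 335982, 1178425

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
total = mean * n                 # sum of all values
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range

print(f"CV       = {cv:.10f}")
print(f"Variance = {variance:.9e}")
print(f"Sum      = {total:.9e}")
print(f"IQR      = {iqr}")
print(f"Range    = {value_range}")
```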
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 829737 | 262 | 0.1% | |
| 829811 | 259 | 0.1% | |
| 501221 | 257 | 0.1% | |
| 734047 | 257 | 0.1% | |
| 1178423 | 257 | 0.1% | |
| 419142 | 257 | 0.1% | |
| 548367 | 256 | 0.1% | |
| 392190 | 256 | 0.1% | |
| 829805 | 256 | 0.1% | |
| 548353 | 255 | 0.1% | |
| Other values (736) | 174001 | 98.5% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 335982 | 225 | 0.1% | |
| 335983 | 248 | 0.1% | |
| 335984 | 219 | 0.1% | |
| 335985 | 246 | 0.1% | |
| 335986 | 240 | 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 1178425 | 223 | 0.1% | |
| 1178424 | 51 | < 0.1% | |
| 1178423 | 257 | 0.1% | |
| 1178422 | 246 | 0.1% | |
| 1178421 | 242 | 0.1% |
season
Real number (ℝ≥0)
| Distinct | 12 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2013.368386 |
|---|---|
| Minimum | 2008 |
| Maximum | 2019 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 1.3 MiB |
Quantile statistics
| Minimum | 2008 |
|---|---|
| 5-th percentile | 2008 |
| Q1 | 2011 |
| median | 2013 |
| Q3 | 2016 |
| 95-th percentile | 2019 |
| Maximum | 2019 |
| Range | 11 |
| Interquartile range (IQR) | 5 |
Descriptive statistics
| Standard deviation | 3.323319105 |
|---|---|
| Coefficient of variation (CV) | 0.001650626447 |
| Kurtosis | -1.126435923 |
| Mean | 2013.368386 |
| Median Absolute Deviation (MAD) | 3 |
| Skewness | 0.07451207835 |
| Sum | 355506496 |
| Variance | 11.04444988 |
| Monotonicity | Increasing |
Histogram with fixed size bins (bins=12)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 2013 | 18152 | 10.3% | |
| 2012 | 17767 | 10.1% | |
| 2011 | 17013 | 9.6% | |
| 2010 | 14489 | 8.2% | |
| 2014 | 14288 | 8.1% | |
| 2018 | 14286 | 8.1% | |
| 2016 | 14096 | 8.0% | |
| 2017 | 13849 | 7.8% | |
| 2015 | 13641 | 7.7% | |
| 2009 | 13595 | 7.7% | |
| Other values (2) | 25397 | 14.4% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 2008 | 13489 | 7.6% | |
| 2009 | 13595 | 7.7% | |
| 2010 | 14489 | 8.2% | |
| 2011 | 17013 | 9.6% | |
| 2012 | 17767 | 10.1% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 2019 | 11908 | 6.7% | |
| 2018 | 14286 | 8.1% | |
| 2017 | 13849 | 7.8% | |
| 2016 | 14096 | 8.0% | |
| 2015 | 13641 | 7.7% |
batsman
Categorical
| Distinct | 514 |
|---|---|
| Distinct (%) | 0.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.3 MiB |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| V Kohli | 4202 | 2.4% | |
| SK Raina | 3968 | 2.2% | |
| S Dhawan | 3732 | 2.1% | |
| RG Sharma | 3732 | 2.1% | |
| G Gambhir | 3524 | 2.0% | |
| RV Uthappa | 3422 | 1.9% | |
| DA Warner | 3397 | 1.9% | |
| MS Dhoni | 3260 | 1.8% | |
| AM Rahane | 3208 | 1.8% | |
| CH Gayle | 3073 | 1.7% | |
| Other values (504) | 141055 | 79.9% |
Frequencies of value counts
Unique
| Unique | 13 |
|---|---|
| Unique (%) | < 0.1% |
Histogram of lengths of the category
Length
| Max length | 23 |
|---|---|
| Median length | 9 |
| Mean length | 9.347589949 |
| Min length | 5 |
bowler
Categorical
| Distinct | 404 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.3 MiB |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| Harbhajan Singh | 3352 | 1.9% | |
| PP Chawla | 3133 | 1.8% | |
| A Mishra | 3100 | 1.8% | |
| R Ashwin | 2966 | 1.7% | |
| SL Malinga | 2878 | 1.6% | |
| P Kumar | 2637 | 1.5% | |
| B Kumar | 2631 | 1.5% | |
| DJ Bravo | 2620 | 1.5% | |
| UT Yadav | 2571 | 1.5% | |
| SP Narine | 2545 | 1.4% | |
| Other values (394) | 148140 | 83.9% |
Frequencies of value counts
Unique
| Unique | 1 |
|---|---|
| Unique (%) | < 0.1% |
Histogram of lengths of the category
Length
| Max length | 23 |
|---|---|
| Median length | 9 |
| Mean length | 9.535931315 |
| Min length | 5 |
innings
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.3 MiB |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 1st | 91487 | 51.8% | |
| 2nd | 85086 | 48.2% |
Frequencies of value counts
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Histogram of lengths of the category
Length
| Max length | 3 |
|---|---|
| Median length | 3 |
| Mean length | 3 |
| Min length | 3 |
non_striker
Categorical
| Distinct | 509 |
|---|---|
| Distinct (%) | 0.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.3 MiB |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| SK Raina | 4092 | 2.3% | |
| V Kohli | 4061 | 2.3% | |
| S Dhawan | 4034 | 2.3% | |
| RG Sharma | 3771 | 2.1% | |
| G Gambhir | 3740 | 2.1% | |
| AM Rahane | 3457 | 2.0% | |
| RV Uthappa | 3327 | 1.9% | |
| DA Warner | 3126 | 1.8% | |
| AB de Villiers | 2982 | 1.7% | |
| CH Gayle | 2969 | 1.7% | |
| Other values (499) | 141014 | 79.9% |
Frequencies of value counts
Unique
| Unique | 5 |
|---|---|
| Unique (%) | < 0.1% |
Histogram of lengths of the category
Length
| Max length | 23 |
|---|---|
| Median length | 9 |
| Mean length | 9.352426475 |
| Min length | 5 |
replacements
Categorical
| Distinct | 30 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 176543 |
| Missing (%) | > 99.9% |
| Memory size | 1.3 MiB |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| {'role': [{'in': 'CJ Anderson', 'reason': 'injury', 'role': 'bowler'}]} | 1 | < 0.1% | |
| {'role': [{'in': 'Harbhajan Singh', 'out': 'DL Chahar', 'reason': 'injury', 'role': 'bowler'}]} | 1 | < 0.1% | |
| {'role': [{'in': 'JM Kemp', 'reason': 'excluded - high full pitched balls', 'role': 'bowler'}]} | 1 | < 0.1% | |
| {'role': [{'in': 'TM Head', 'reason': 'injury', 'role': 'bowler'}]} | 1 | < 0.1% | |
| {'role': [{'in': 'AT Rayudu', 'out': 'SR Tendulkar', 'reason': 'injury', 'role': 'batter'}]} | 1 | < 0.1% | |
| {'role': [{'in': 'Bipul Sharma', 'out': 'Harmeet Singh', 'reason': 'excluded - high full pitched balls', 'role': 'bowler'}]} | 1 | < 0.1% | |
| {'role': [{'in': 'MP Stoinis', 'out': 'Mohammed Siraj', 'reason': 'excluded - high full pitched balls', 'role': 'bowler'}]} | 1 | < 0.1% | |
| {'role': [{'in': 'DL Chahar', 'out': 'KM Jadhav', 'reason': 'injury', 'role': 'batter'}]} | 1 | < 0.1% | |
| {'role': [{'in': 'Yuvraj Singh', 'out': 'KC Sangakkara', 'reason': 'injury', 'role': 'batter'}]} | 1 | < 0.1% | |
| {'role': [{'in': 'R Ashwin', 'out': 'Mujeeb Ur Rahman', 'reason': 'injury', 'role': 'bowler'}]} | 1 | < 0.1% | |
| Other values (20) | 20 | < 0.1% | |
| (Missing) | 176543 | > 99.9% |
Frequencies of value counts
Unique
| Unique | 30 |
|---|---|
| Unique (%) | 100.0% |
Histogram of lengths of the category
Length
| Max length | 124 |
|---|---|
| Median length | 3 |
| Mean length | 3.014424629 |
| Min length | 3 |
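The replacements values listed above look like Python dictionary literals stored as text. A minimal sketch of how such a value could be parsed, assuming the column really holds strings in this form and that missing entries arrive as NaN or None (the helper name `parse_replacement` is hypothetical):

```python
import ast

# One value copied from the frequency table above; it is a string, not a dict.
raw = ("{'role': [{'in': 'Harbhajan Singh', 'out': 'DL Chahar', "
       "'reason': 'injury', 'role': 'bowler'}]}")

def parse_replacement(value):
    """Return the list of replacement records, or [] for a missing value."""
    if not isinstance(value, str):
        return []  # NaN / None: the > 99.9% of rows without a replacement
    # literal_eval only evaluates Python literals, so it is safer than eval().
    return ast.literal_eval(value)["role"]

records = parse_replacement(raw)
for record in records:
    # 'out' is absent in some of the listed values, hence .get().
    print(record["in"], record.get("out"), record["reason"], record["role"])
```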
bowled_over
Real number (ℝ≥0)
| Distinct | 180 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 9.528801685 |
|---|---|
| Minimum | 0.1 |
| Maximum | 19.9 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 1.3 MiB |
Quantile statistics
| Minimum | 0.1 |
|---|---|
| 5-th percentile | 0.6 |
| Q1 | 4.5 |
| median | 9.4 |
| Q3 | 14.4 |
| 95-th percentile | 18.5 |
| Maximum | 19.9 |
| Range | 19.8 |
| Interquartile range (IQR) | 9.9 |
Descriptive statistics
| Standard deviation | 5.677219708 |
|---|---|
| Coefficient of variation (CV) | 0.5957957669 |
| Kurtosis | -1.180961644 |
| Mean | 9.528801685 |
| Median Absolute Deviation (MAD) | 4.9 |
| Skewness | 0.04965397921 |
| Sum | 1682529.1 |
| Variance | 32.23082361 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 1.1 | 1491 | 0.8% | |
| 0.5 | 1490 | 0.8% | |
| 0.1 | 1490 | 0.8% | |
| 0.3 | 1490 | 0.8% | |
| 2.1 | 1490 | 0.8% | |
| 3.1 | 1490 | 0.8% | |
| 0.6 | 1490 | 0.8% | |
| 0.2 | 1490 | 0.8% | |
| 0.4 | 1490 | 0.8% | |
| 4.1 | 1489 | 0.8% | |
| Other values (170) | 161673 | 91.6% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0.1 | 1490 | 0.8% | |
| 0.2 | 1490 | 0.8% | |
| 0.3 | 1490 | 0.8% | |
| 0.4 | 1490 | 0.8% | |
| 0.5 | 1490 | 0.8% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 19.9 | 5 | < 0.1% | |
| 19.8 | 35 | < 0.1% | |
| 19.7 | 239 | 0.1% | |
| 19.6 | 969 | 0.5% | |
| 19.5 | 1017 | 0.6% |
batsman_team
Categorical
| Distinct | 15 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.3 MiB |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| Mumbai Indians | 22149 | 12.5% | |
| Royal Challengers Bangalore | 20770 | 11.8% | |
| Kings XI Punjab | 20684 | 11.7% | |
| Kolkata Knight Riders | 20592 | 11.7% | |
| Chennai Super Kings | 19271 | 10.9% | |
| Delhi Daredevils | 18780 | 10.6% | |
| Rajasthan Royals | 17147 | 9.7% | |
| Sunrisers Hyderabad | 12525 | 7.1% | |
| Deccan Chargers | 9034 | 5.1% | |
| Pune Warriors | 5443 | 3.1% | |
| Other values (5) | 10178 | 5.8% |
Frequencies of value counts
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Histogram of lengths of the category
Length
| Max length | 27 |
|---|---|
| Median length | 16 |
| Mean length | 17.99051384 |
| Min length | 13 |
player_out
Categorical
| Distinct | 487 |
|---|---|
| Distinct (%) | 0.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.3 MiB |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 167862 | 95.1% | |
| SK Raina | 157 | 0.1% | |
| RG Sharma | 152 | 0.1% | |
| RV Uthappa | 151 | 0.1% | |
| V Kohli | 142 | 0.1% | |
| G Gambhir | 136 | 0.1% | |
| S Dhawan | 135 | 0.1% | |
| KD Karthik | 134 | 0.1% | |
| PA Patel | 125 | 0.1% | |
| AM Rahane | 115 | 0.1% | |
| Other values (477) | 7464 | 4.2% |
Frequencies of value counts
Unique
| Unique | 84 |
|---|---|
| Unique (%) | < 0.1% |
Histogram of lengths of the category
Length
| Max length | 23 |
|---|---|
| Median length | 1 |
| Mean length | 1.413789198 |
| Min length | 1 |
fielder_caught_out
Categorical
| Distinct | 509 |
|---|---|
| Distinct (%) | 0.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.3 MiB |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 170319 | 96.5% | |
| MS Dhoni | 152 | 0.1% | |
| KD Karthik | 151 | 0.1% | |
| RV Uthappa | 123 | 0.1% | |
| AB de Villiers | 113 | 0.1% | |
| SK Raina | 110 | 0.1% | |
| PA Patel | 95 | 0.1% | |
| RG Sharma | 90 | 0.1% | |
| V Kohli | 86 | < 0.1% | |
| NV Ojha | 82 | < 0.1% | |
| Other values (499) | 5252 | 3.0% |
Frequencies of value counts
Unique
| Unique | 97 |
|---|---|
| Unique (%) | 0.1% |
Histogram of lengths of the category
Length
| Max length | 23 |
|---|---|
| Median length | 1 |
| Mean length | 1.302135661 |
| Min length | 1 |
type_out
Categorical
| Distinct | 10 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.3 MiB |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 167862 | 95.1% | |
| caught | 5219 | 3.0% | |
| bowled | 1566 | 0.9% | |
| run out | 844 | 0.5% | |
| lbw | 530 | 0.3% | |
| stumped | 280 | 0.2% | |
| caught and bowled | 250 | 0.1% | |
| retired hurt | 11 | < 0.1% | |
| hit wicket | 10 | < 0.1% | |
| obstructing the field | 1 | < 0.1% |
Frequencies of value counts
Unique
| Unique | 1 |
|---|---|
| Unique (%) | < 0.1% |
Histogram of lengths of the category
Length
| Max length | 21 |
|---|---|
| Median length | 1 |
| Mean length | 1.260288946 |
| Min length | 1 |
extras_wides
Real number (ℝ≥0)
| Distinct | 6 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.03682329688 |
|---|---|
| Minimum | 0 |
| Maximum | 5 |
| Zeros | 171230 |
| Zeros (%) | 97.0% |
| Memory size | 1.3 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 0 |
| Maximum | 5 |
| Range | 5 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 0.2516125522 |
|---|---|
| Coefficient of variation (CV) | 6.832971881 |
| Kurtosis | 191.5014881 |
| Mean | 0.03682329688 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 11.65890837 |
| Sum | 6502 |
| Variance | 0.0633088764 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=6)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 171230 | 97.0% | |
| 1 | 4858 | 2.8% | |
| 2 | 229 | 0.1% | |
| 5 | 207 | 0.1% | |
| 3 | 45 | < 0.1% | |
| 4 | 4 | < 0.1% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 171230 | 97.0% | |
| 1 | 4858 | 2.8% | |
| 2 | 229 | 0.1% | |
| 3 | 45 | < 0.1% | |
| 4 | 4 | < 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 5 | 207 | 0.1% | |
| 4 | 4 | < 0.1% | |
| 3 | 45 | < 0.1% | |
| 2 | 229 | 0.1% | |
| 1 | 4858 | 2.8% |
extras_legbyes
Real number (ℝ≥0)
| Distinct | 6 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.02119803141 |
|---|---|
| Minimum | 0 |
| Maximum | 5 |
| Zeros | 173664 |
| Zeros (%) | 98.4% |
| Memory size | 1.3 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 0 |
| Maximum | 5 |
| Range | 5 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 0.1949347214 |
|---|---|
| Coefficient of variation (CV) | 9.195887942 |
| Kurtosis | 241.5230135 |
| Mean | 0.02119803141 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 13.74586003 |
| Sum | 3743 |
| Variance | 0.03799954562 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=6)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 173664 | 98.4% | |
| 1 | 2536 | 1.4% | |
| 4 | 216 | 0.1% | |
| 2 | 136 | 0.1% | |
| 3 | 17 | < 0.1% | |
| 5 | 4 | < 0.1% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 173664 | 98.4% | |
| 1 | 2536 | 1.4% | |
| 2 | 136 | 0.1% | |
| 3 | 17 | < 0.1% | |
| 4 | 216 | 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 5 | 4 | < 0.1% | |
| 4 | 216 | 0.1% | |
| 3 | 17 | < 0.1% | |
| 2 | 136 | 0.1% | |
| 1 | 2536 | 1.4% |
extras_noballs
Real number (ℝ≥0)
| Distinct | 5 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.004179574454 |
|---|---|
| Minimum | 0 |
| Maximum | 5 |
| Zeros | 175870 |
| Zeros (%) | 99.6% |
| Memory size | 1.3 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 0 |
| Maximum | 5 |
| Range | 5 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 0.07055253733 |
|---|---|
| Coefficient of variation (CV) | 16.88031595 |
| Kurtosis | 1056.80134 |
| Mean | 0.004179574454 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 24.59266034 |
| Sum | 738 |
| Variance | 0.004977660524 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=5)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 175870 | 99.6% | |
| 1 | 687 | 0.4% | |
| 2 | 9 | < 0.1% | |
| 5 | 6 | < 0.1% | |
| 3 | 1 | < 0.1% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 175870 | 99.6% | |
| 1 | 687 | 0.4% | |
| 2 | 9 | < 0.1% | |
| 3 | 1 | < 0.1% | |
| 5 | 6 | < 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 5 | 6 | < 0.1% | |
| 3 | 1 | < 0.1% | |
| 2 | 9 | < 0.1% | |
| 1 | 687 | 0.4% | |
| 0 | 175870 | 99.6% |
extras_byes
Real number (ℝ≥0)
| Distinct | 5 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.004961120896 |
|---|---|
| Minimum | 0 |
| Maximum | 4 |
| Zeros | 176097 |
| Zeros (%) | 99.7% |
| Memory size | 1.3 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 0 |
| Maximum | 4 |
| Range | 4 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 0.1166742622 |
|---|---|
| Coefficient of variation (CV) | 23.51772203 |
| Kurtosis | 971.0286256 |
| Mean | 0.004961120896 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 29.80374639 |
| Sum | 876 |
| Variance | 0.01361288346 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=5)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 176097 | 99.7% | |
| 1 | 321 | 0.2% | |
| 4 | 121 | 0.1% | |
| 2 | 31 | < 0.1% | |
| 3 | 3 | < 0.1% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 176097 | 99.7% | |
| 1 | 321 | 0.2% | |
| 2 | 31 | < 0.1% | |
| 3 | 3 | < 0.1% | |
| 4 | 121 | 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 4 | 121 | 0.1% | |
| 3 | 3 | < 0.1% | |
| 2 | 31 | < 0.1% | |
| 1 | 321 | 0.2% | |
| 0 | 176097 | 99.7% |
extras_penalty
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.3 MiB |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 176571 | > 99.9% | |
| 5 | 2 | < 0.1% |
Frequencies of value counts
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Histogram of lengths of the category
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
total_extras_runs
Real number (ℝ≥0)
| Distinct | 7 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.06721865744 |
|---|---|
| Minimum | 0 |
| Maximum | 7 |
| Zeros | 167142 |
| Zeros (%) | 94.7% |
| Memory size | 1.3 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 1 |
| Maximum | 7 |
| Range | 7 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 0.3430137251 |
|---|---|
| Coefficient of variation (CV) | 5.102954122 |
| Kurtosis | 91.09801952 |
| Mean | 0.06721865744 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 8.226940401 |
| Sum | 11869 |
| Variance | 0.1176584156 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=7)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 167142 | 94.7% | |
| 1 | 8401 | 4.8% | |
| 2 | 404 | 0.2% | |
| 4 | 342 | 0.2% | |
| 5 | 218 | 0.1% | |
| 3 | 65 | < 0.1% | |
| 7 | 1 | < 0.1% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 167142 | 94.7% | |
| 1 | 8401 | 4.8% | |
| 2 | 404 | 0.2% | |
| 3 | 65 | < 0.1% | |
| 4 | 342 | 0.2% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 7 | 1 | < 0.1% | |
| 5 | 218 | 0.1% | |
| 4 | 342 | 0.2% | |
| 3 | 65 | < 0.1% | |
| 2 | 404 | 0.2% |
batsman_runs
Real number (ℝ≥0)
| Distinct | 7 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.237431544 |
|---|---|
| Minimum | 0 |
| Maximum | 6 |
| Zeros | 71130 |
| Zeros (%) | 40.3% |
| Memory size | 1.3 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 1 |
| Q3 | 1 |
| 95-th percentile | 4 |
| Maximum | 6 |
| Range | 6 |
| Interquartile range (IQR) | 1 |
Descriptive statistics
| Standard deviation | 1.609116193 |
|---|---|
| Coefficient of variation (CV) | 1.300367847 |
| Kurtosis | 1.638279999 |
| Mean | 1.237431544 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 1.585889292 |
| Sum | 218497 |
| Variance | 2.589254921 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=7)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 71130 | 40.3% | |
| 1 | 65416 | 37.0% | |
| 4 | 20075 | 11.4% | |
| 2 | 11292 | 6.4% | |
| 6 | 8035 | 4.6% | |
| 3 | 569 | 0.3% | |
| 5 | 56 | < 0.1% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 71130 | 40.3% | |
| 1 | 65416 | 37.0% | |
| 2 | 11292 | 6.4% | |
| 3 | 569 | 0.3% | |
| 4 | 20075 | 11.4% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 6 | 8035 | 4.6% | |
| 5 | 56 | < 0.1% | |
| 4 | 20075 | 11.4% | |
| 3 | 569 | 0.3% | |
| 2 | 11292 | 6.4% |
total_runs
Real number (ℝ≥0)
| Distinct | 8 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.304650201 |
|---|---|
| Minimum | 0 |
| Maximum | 7 |
| Zeros | 62100 |
| Zeros (%) | 35.2% |
| Memory size | 1.3 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 1 |
| Q3 | 1 |
| 95-th percentile | 4 |
| Maximum | 7 |
| Range | 7 |
| Interquartile range (IQR) | 1 |
Descriptive statistics
| Standard deviation | 1.597156266 |
|---|---|
| Coefficient of variation (CV) | 1.224202675 |
| Kurtosis | 1.579171701 |
| Mean | 1.304650201 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 1.555697515 |
| Sum | 230366 |
| Variance | 2.550908138 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=8)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 73180 | 41.4% | |
| 0 | 62100 | 35.2% | |
| 4 | 20337 | 11.5% | |
| 2 | 11894 | 6.7% | |
| 6 | 7988 | 4.5% | |
| 3 | 672 | 0.4% | |
| 5 | 354 | 0.2% | |
| 7 | 48 | < 0.1% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 62100 | 35.2% | |
| 1 | 73180 | 41.4% | |
| 2 | 11894 | 6.7% | |
| 3 | 672 | 0.4% | |
| 4 | 20337 | 11.5% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 7 | 48 | < 0.1% | |
| 6 | 7988 | 4.5% | |
| 5 | 354 | 0.2% | |
| 4 | 20337 | 11.5% | |
| 3 | 672 | 0.4% |
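The reported high correlation between total_runs and batsman_runs is consistent with the column sums in the report. A short Python check, using only the "Sum" entries and the extras_penalty counts reported above; it suggests, but does not prove, that each row satisfies total_runs = batsman_runs + total_extras_runs:

```python
# "Sum" values copied from the descriptive statistics of each variable.
extras_sums = {
    "extras_wides": 6502,
    "extras_legbyes": 3743,
    "extras_noballs": 738,
    "extras_byes": 876,
    "extras_penalty": 2 * 5,  # two rows with value 5, all others 0
}
total_extras_runs_sum = 11869
batsman_runs_sum = 218497
total_runs_sum = 230366

# The individual extras add up to the total extras ...
extras_match = sum(extras_sums.values()) == total_extras_runs_sum
# ... and batsman runs plus extras add up to the total runs.
runs_match = batsman_runs_sum + total_extras_runs_sum == total_runs_sum

print("extras add up:", extras_match)
print("runs add up:", runs_match)
```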
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
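The calculation described above can be sketched in plain Python. This is an illustration only, not the code the report itself runs; the function name `pearson_r` is hypothetical, and the sample values are the batsman_runs and total_runs columns of the ten "First rows" shown later in this report:

```python
import math

def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # The 1/(n-1) factors of covariance and standard deviations cancel out,
    # so plain sums of products and squares are enough.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(ss_x * ss_y)

batsman_runs = [0, 0, 4, 0, 6, 0, 0, 0, 4, 4]
total_runs = [1, 0, 4, 0, 6, 0, 0, 0, 4, 4]
r = pearson_r(batsman_runs, total_runs)
print(f"r = {r:.4f}")

# Invariance under changes in location and scale: rescaling one variable
# with a positive factor and shifting it leaves r unchanged.
r_shifted = pearson_r([3 * x + 7 for x in batsman_runs], total_runs)
```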
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
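As a rough illustration of the definition above, ρ can be computed by ranking both variables and applying the Pearson formula to the ranks. The sketch below assumes ties receive the average of their positions, which is a common convention but not something this report states; the function names are hypothetical:

```python
import math

def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson_r(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return cov / math.sqrt(sum((x - mx) ** 2 for x in xs) *
                           sum((y - my) ** 2 for y in ys))

def spearman_rho(xs, ys):
    """Pearson correlation of the rank variables."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

# A monotonic but nonlinear relationship: rho is 1, Pearson's r is below 1.
xs = [1, 2, 3, 4, 5]
ys = [x ** 3 for x in xs]
print(spearman_rho(xs, ys), pearson_r(xs, ys))
```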
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
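The pair-counting rule above corresponds to the variant usually called tau-a. A minimal Python sketch follows; the function name is hypothetical, and library implementations commonly apply an additional tie correction (tau-b), so results may differ when the data contain ties:

```python
from itertools import combinations

def kendall_tau_a(xs, ys):
    """(concordant - discordant) / total number of pairs."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1   # both variables move in the same direction
        elif sign < 0:
            discordant += 1   # the variables move in opposite directions
        # sign == 0: a tie in x or y, counted in neither group
    n = len(xs)
    total_pairs = n * (n - 1) / 2
    return (concordant - discordant) / total_pairs

# Six pairs in total: five concordant, one discordant.
tau = kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4])
print(tau)
```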
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation is available for Phik.
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been proved to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013.
First rows
| | id | season | batsman | bowler | innings | non_striker | replacements | bowled_over | batsman_team | player_out | fielder_caught_out | type_out | extras_wides | extras_legbyes | extras_noballs | extras_byes | extras_penalty | total_extras_runs | batsman_runs | total_runs |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 335988 | 2008 | AC Gilchrist | GD McGrath | 1st | JC Buttler | NaN | 0.1 | Rajasthan Royals | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 1 |
| 1 | 335988 | 2008 | AC Gilchrist | GD McGrath | 1st | AM Rahane | NaN | 0.2 | Rajasthan Royals | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 2 | 335988 | 2008 | AC Gilchrist | GD McGrath | 1st | AM Rahane | NaN | 0.3 | Rajasthan Royals | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 4 |
| 3 | 335988 | 2008 | Y Venugopal Rao | GD McGrath | 1st | AM Rahane | NaN | 0.4 | Rajasthan Royals | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 4 | 335988 | 2008 | Y Venugopal Rao | GD McGrath | 1st | AM Rahane | NaN | 0.5 | Rajasthan Royals | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 6 | 6 |
| 5 | 335988 | 2008 | Y Venugopal Rao | GD McGrath | 1st | AM Rahane | NaN | 0.6 | Rajasthan Royals | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 6 | 335988 | 2008 | AC Gilchrist | Mohammad Asif | 1st | JC Buttler | NaN | 1.1 | Rajasthan Royals | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 7 | 335988 | 2008 | AC Gilchrist | Mohammad Asif | 1st | JC Buttler | NaN | 1.2 | Rajasthan Royals | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 8 | 335988 | 2008 | AC Gilchrist | Mohammad Asif | 1st | JC Buttler | NaN | 1.3 | Rajasthan Royals | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 4 |
| 9 | 335988 | 2008 | AC Gilchrist | Mohammad Asif | 1st | JC Buttler | NaN | 1.4 | Rajasthan Royals | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 4 |
Last rows
| | id | season | batsman | bowler | innings | non_striker | replacements | bowled_over | batsman_team | player_out | fielder_caught_out | type_out | extras_wides | extras_legbyes | extras_noballs | extras_byes | extras_penalty | total_extras_runs | batsman_runs | total_runs |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 176563 | 1178424 | 2019 | LS Livingstone | NA Saini | 2nd | MS Dhoni | NaN | 18.3 | Chennai Super Kings | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 |
| 176564 | 1178424 | 2019 | LS Livingstone | NA Saini | 2nd | SW Billings | NaN | 18.4 | Chennai Super Kings | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 176565 | 1178424 | 2019 | SV Samson | K Khejroliya | 2nd | SW Billings | NaN | 18.5 | Chennai Super Kings | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 2 |
| 176566 | 1178424 | 2019 | SV Samson | K Khejroliya | 2nd | SW Billings | NaN | 18.6 | Chennai Super Kings | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 4 |
| 176567 | 1178424 | 2019 | LS Livingstone | K Khejroliya | 2nd | MS Dhoni | NaN | 19.1 | Chennai Super Kings | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 4 |
| 176568 | 1178424 | 2019 | SV Samson | K Khejroliya | 2nd | MS Dhoni | NaN | 19.2 | Chennai Super Kings | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 4 |
| 176569 | 1178424 | 2019 | SV Samson | K Khejroliya | 2nd | MS Dhoni | NaN | 19.3 | Chennai Super Kings | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 2 |
| 176570 | 1178424 | 2019 | SV Samson | K Khejroliya | 2nd | MS Dhoni | NaN | 19.4 | Chennai Super Kings | SW Billings | 0 | run out | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 176571 | 1178424 | 2019 | LS Livingstone | YS Chahal | 2nd | MS Dhoni | NaN | 19.5 | Chennai Super Kings | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 |
| 176572 | 1178424 | 2019 | SV Samson | YS Chahal | 2nd | DJ Bravo | NaN | 19.6 | Chennai Super Kings | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 |